Sampled fictitious play for multi-action stochastic dynamic programs
Authors
Abstract
We introduce a class of finite-horizon dynamic optimization problems that we call multi-action stochastic dynamic programs (DPs). Their distinguishing feature is that the decision in each state is a multi-dimensional vector. These problems can in principle be solved using Bellman’s backward recursion. However, the complexity of this procedure grows exponentially in the dimension of the decision vectors, a difficulty known as the curse of action-space dimensionality. To overcome this computational challenge, we propose an approximation algorithm rooted in the game-theoretic paradigm of Sampled Fictitious Play (SFP). SFP solves a sequence of DPs with a one-dimensional action space, which are exponentially smaller than the original multi-action stochastic DP. In particular, the computational effort in a fixed number of SFP iterations is linear in the dimension of the decision vectors. We show that the sequence of SFP iterates converges to a local optimum, and present a numerical case study in manufacturing where SFP finds solutions with objective values within 1% of the optimal objective value hundreds of times faster than backward recursion. In this case study, SFP solutions are also better by a statistically significant margin than those found by a one-step lookahead heuristic.
∗Corresponding author. Industrial and Systems Engineering, Box 352650, The University of Washington, Seattle, WA 98195. Email: [email protected].
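The scaling contrast in the abstract can be illustrated with a minimal sketch. This is not the paper's SFP algorithm: it leaves out the sampling from play history and the state and time structure of the DP, and only shows why optimizing one decision component at a time costs linear rather than exponential effort. All function names and the toy objective are hypothetical, chosen purely for illustration.

```python
def joint_actions_per_state(num_components, choices_per_component):
    """Decision vectors that full enumeration must examine in one state."""
    return choices_per_component ** num_components


def single_component_actions(num_components, choices_per_component):
    """Actions examined in one sweep when each component is optimized
    separately while the other components are held fixed."""
    return num_components * choices_per_component


def best_response_sweep(objective, decision, choices):
    """One pass of component-wise best responses (illustrative only).

    Each component in turn picks the value that maximizes the shared
    objective, holding every other component at its current value.
    """
    decision = list(decision)
    for i in range(len(decision)):
        decision[i] = max(
            choices,
            key=lambda a: objective(decision[:i] + [a] + decision[i + 1:]),
        )
    return decision


def toy_objective(x):
    """Hypothetical shared objective: total decision should equal 3."""
    return -(sum(x) - 3) ** 2


if __name__ == "__main__":
    print(joint_actions_per_state(10, 5))    # exponential in the dimension
    print(single_component_actions(10, 5))   # linear in the dimension
    print(best_response_sweep(toy_objective, [0, 0, 0, 0], [0, 1, 2]))
```

A component-wise scheme of this kind can stop at a point that no single component can improve on its own, which is consistent with the abstract's claim of convergence to a local rather than a global optimum.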
Similar resources
Sampled fictitious play for approximate dynamic programming
Sampled Fictitious Play (SFP) is a recently proposed iterative learning mechanism for computing Nash equilibria of non-cooperative games. For games of identical interests, every limit point of the sequence of mixed strategies induced by the empirical frequencies of best response actions that players in SFP play is a Nash equilibrium. Because discrete optimization problems can be viewed as games...
A Computationally Efficient Implementation of Fictitious Play for Large-Scale Games
The paper is concerned with distributed learning and optimization in large-scale settings. The well-known Fictitious Play (FP) algorithm has been shown to achieve Nash equilibrium learning in certain classes of multi-agent games. However, FP can be computationally difficult to implement when the number of players is large. Sampled FP is a variant of FP that mitigates the computational difficulti...
Sampled Fictitious Play for Black-Box Stochastic Sequential Decision Problems
In this paper, we propose an algorithm based on Sampled Fictitious Play for solving finite-horizon stochastic sequential decision problems. Our method models the decision problem as a game of identical interest between multiple players, who use the history of their past plays to improve the estimate of optimal reward in the initial state. We show that this method is able to find an optimal polic...
Stochastic fictitious play with continuous action sets
Continuous action space games form a natural extension to normal form games with finite action sets. However, whilst learning dynamics in normal form games are now well studied, it is not until recently that their continuous action space counterparts have been examined. We extend stochastic fictitious play to the continuous action space framework. In normal form games the limiting behaviour of ...
Sampled Fictitious Play is Hannan Consistent
Fictitious play is a simple and widely studied adaptive heuristic for playing repeated games. It is well known that fictitious play fails to be Hannan consistent. Several variants of fictitious play, including regret matching, generalized regret matching and smooth fictitious play, are known to be Hannan consistent. In this note, we consider sampled fictitious play: at each round, the player sam...
Publication date: 2013